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ABSTRACT: 

In  this  paper,  we  consider  the  problem  of  estimating  a  parameter  a  that  can  be 
expressed  as  a  nonlinear  function  of  sample  means.  We  develop  a  jackknife  estimator  for 
q  that  is  appropriate  to  computational  settings  in  which  the  total  computer  budget  to  be 
used  is  constrained.  Despite  the  fact  that  the  jackknifed  observations  are  not  i.i.d..  we 
are  able  to  show  that  our  jackknife  estimator  reduces  bias  without  increasing  asymptotic 
variance.  This  makes  the  estimator  particularly  suitable  for  small  sample  applications 
Because  a  special  case  of  this  estimator  problem  is  that  of  estimating  a  ratio  of  two  means, 
the  results  in  this  paper  are  pertinent  to  regenerative  steady-state  simulations. 
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ABSTRACT: 

In  this  paper,  we  consider  the  problem  of  estimating  a  parameter  a  that  can  be 
expressed  as  a  nonlinear  function  of  sample  means.  We  develop  a  jackknife  estimator  for 
a  that  is  appropriate  to  computational  settings  in  which  the  total  computer  budget  to  be 
used  is  constrained.  Despite  the  fact  that  the  jackknifed  observations  are  not  i.i.d.,  we 
are  able  to  show  that  our  jackknife  estimator  reduces  bias  without  increasing  asymptotic 
variance.  This  makes  the  estimator  particularly  suitable  for  small  sample  applications. 
Because  a  special  case  of  this  estimator  problem  is  that  of  estimating  a  ratio  of  two  means, 
the  results  in  this  paper  are  pertinent  to  regenerative  steady-state  simulations. 

KEYWORDS:  Bias,  jackknife,  regenerative  simulation 

1.  INTRODUCTION 

Consider  the  problem  of  estimating,  via  simulation,  the  parameter  a  =  g(ji)  where 
g  :  Rd  — ♦  R  is  a  (possibly)  nonlinear  function  and  p  is  expressible  as  the  mean  of  an  Re¬ 
valued  random  vector  X.  This  estimation  problem  arises  in  certain  terminating  simulation 
settings,  as  well  as  in  steady-state  regenerative  simulations  (see  Section  2  for  examples  of 
specific  applications).  This  paper  is  concerned  with  the  problem  of  estimating  a  when 
a  budget  constraint  t ,  representing  the  maximum  amount  of  computer  time  to  be  used, 
is  imposed  on  the  simulation.  Two  major  statistical  difficulties  arise  in  this  estimation 
setting.  Firstly,  the  number  of  replications  N(t)  of  the  r.v.  X  generated  within  the  budget 
constraint  t  is  random.  Consequently,  (deterministic)  fixed  sample  size  theory  does  not 
apply  to  this  class  of  estimation  problems.  The  second  difficulty  is  that  the  nonlinearity  in 
the  function  g  produces  bias  in  the  estimation  of  a.  A  standard  statistical  technique  used 
to  deal  with  bias  that  emanates  from  nonlinearities  of  this  kind  is  to  jackknife  the  estimator 
(see,  for  example,  MILLER  (1974)  and  IGLEHART  (1975)).  However,  the  jackknife  litera¬ 
ture  typically  requires  that  the  observations  being  jackknifed  be  i.i.d.  Unfortunately,  when 
a  budget  constraint  is  imposed,  the  observations  are  clearly  no  longer  independent.  The 
major  contribution  of  this  paper  is  to  show  that  the  dependence  induced  by  the  presence 
of  the  budget  constraint  does  not  destroy  the  bias-  reducing  properties  of  the  jackknife.  As 
indicated  earlier,  the  jackknifed  estimator  that  we  obtain  here  has  important  implications 
for  steady-state  regenerative  simulation.  In  particular,  the  estimator  obtained  here  has 
the  same  bias-reducing  properties  as  the  regenerative  low-bias  estimator  introduced  by 
MEKETON  and  HEIDELBERGER  (1982). 

This  paper  is  organized  as  follows.  In  Section  2,  we  precisely  describe  the  estimation 
problem  and  give  examples  of  various  applications  settings.  We  then  proceed  to  describe 
the  major  results  of  this  paper.  Section  3  discusses  the  empirical  behaviour  of  our  budget- 
constrained  jackknife.  We  defer  ail  proofs  to  Section  4. 
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2.  DESCRIPTION  OF  MAIN  RESULTS 

Suppose  that  X  is  an  Revalued  random  vector  with  finite  mean  /i.  As  stated  in  the 
Introduction,  our  goal  in  this  paper  is  to  estimate  a  =  subject  to  a  budget  constraint 
on  the  total  amount  t  of  computer  time  to  be  used.  Throughout  this  paper,  we  assume 
that  g  :  Rd  — ►  R  has  continuous  second  partial  derivatives  in  some  open  neighborhood  for 
fi.  This  type  of  estimation  problem  arises  in  several  different  problem  settings. 

EXAMPLE  1.  Given  a  stochastic  system  Y,  we  may  be  interested  in  the  performance 
of  the  system  over  some  (finite)  time  horizon  T  (T  may  be  deterministic  or  it  may  be 
a  stopping  time  with  respect  to  Y;  see  p.  322  of  CHUNG  (1974)  for  a  definition  of 
stopping  time).  Let  Z  be  a  real-valued  performance  measure  that  depends  on  Y  over  the 
above  time  horizon,  so  that  Z  can  be  represented  as  Z  =  f(Y(s)  :  0  <  s  <  T).  Estimation 
of  both  the  mean  and  variance  of  the  performance  measure  Z  are  special  cases  of  the 
estimation  problem  considered  in  this  paper.  To  incorporate  the  estimation  of  a  =  EZ 
into  our  framework,  we  set  X  =  Z  and  g(x)  =  x.  For  the  variance,  we  let  X  =  (Z2,Z) 
and  g{xi,i2)  =  x\  —  x\.  Note  that  in  the  case  of  the  variance,  g  is  a  nonlinear  mapping. 

EXAMPLE  2.  Suppose  that  Y  is  a  real-valued  non-delayed  regenerative  process  and 
that  Z  is  the  /3-discounted  cost  associated  with  Y ,  namely 

Z  =  f  exp (—0s)Y(s)  ds. 

Jo 

It  is  shown  in  FOX  and  GLYNN  (1989)  that  a  =  EZ  can  be  re-  expressed  as  a  =  g(EX), 
where 

X  =  (/  exp(— /?s)Y(s)  ds,  exp(— #77)^  , 

77  is  the  first  positive  regeneration  time  of  Y,  and  g{x\,  X2)  =  xi(l  —  X2)-1.  The  advantage 
of  this  alternative  representation  is  that  the  random  vector  X  can  be  generated  in  finite 
time,  whereas  Z  typically  can  not.  It  is  further  shown  in  FOX  and  GLYNN  (1989)  that 
the  expected  /^-discounted  cost  associated  with  a  delayed  regenerative  process  can  also  be 
expressed  in  the  form  a  =  g(EX),  although  the  precise  nature  of  X  and  g  is  then  more 
complicated. 

EXAMPLE  3.  Let  Y  be  a  non-delayed  real-valued  regenerative  process  and  suppose 
that  77  is  the  first  positive  regeneration  time  of  Y.  Regenerative  process  theory  (see,  for 
example,  SMITH  (1955))  shows  that  the  steady-state  mean  a  of  the  process  Y  can  be 
represented  as  the  ratio 

r 

a  =  E  Y(s)/Er). 

Jo 

The  above  ratio  estimation  problem  can  be  incorporated  into  our  set-up  by  letting 

X  =  (JV  Y(s)  ds,  77) 

and  putting  <7(xj,X2)  =  X\/x2-  Thus,  we  conclude  that  the  problem  of  estimating  the 
steady-state  mean  of  a  regenerative  stochastic  process  is  a  special  case  of  the  estimation 
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problem  considered  in  this  paper.  The  observation  that  steady-state  estimation  for  regen¬ 
erative  processes  is  a  special  case  of  ratio  estimation  lies  at  the  heart  of  the  regenerative 
method  for  steady-  state  simulation  output  analysis  (see  CRANE  and  LEMOINE  (1977) 
for  a  more  complete  description  of  the  method). 

EXAMPLE  4.  Suppose  that  rather  than  estimating  the  steady-state  mean  of  a  regener¬ 
ative  stochastic  process,  we  now  wish  to  estimate  the  steady-state  variance.  To  be  more 
precise,  suppose  that  Y  is  a  real-valued  non-delayed  regenerative  process.  Then,  under 
suitable  regularity  conditions  on  Y ,  there  exists  a  r.v.  Y *  such  that  Y(t)  =>  Y *  as  t  — ►  oo; 
Ym  has  the  steady-state  distribution  associated  with  the  regenerative  process  Y.  The  prob¬ 
lem  of  estimating  a  =  EY *  was  discussed  in  Example  3.  It  turns  out  that  the  regenerative 
structure  of  Y  can  also  be  fruitful  exploited  to  estimate  a  =  var  Y*.  More  specifically, 
a  =  g(EX).  where 

X=(J\2(s)  ds,  J*  Y(s)  ds,  t,y 

g  is  the  first  positive  regenerative  time  of  Y ,  and  g{x\. 12,13)  =  (xi/xy)  —  (X2/X4)2 .  The 
problem  of  estimating  a  =  var  Y *,  when  no  budget  constraint  is  present,  is  discussed  in 
GLYNN  and  IGLEHART  (1986). 

Returning  to  the  estimation  problem  at  hand,  our  estimators  are  obtained  by  gen¬ 
erating  i.i.d.  copies  X\ ,  X2, . . .  of  the  r.v.  X.  Suppose  that  r,  is  the  computer  time 
required  to  generate  Xj.  We  assume  throughout  this  paper  that  the  sequence  of  pairs 
{(A'n,r„)  :  n  >  1}  are  i.i.d.  However,  we  permit  Xn  and  r„  to  be  dependent  r.v.’s.  Indeed, 
in  most  applications,  X„  and  t„  will  be  strongly  correlated. 

n 

Given  a  budget  constraint  t ,  let  N(t)  =  max{n  >  0  :  53  r>  —  0  be  the  number  of 

1=1 

observations  completed  by  time  t.  The  classical  estimator  for  a  is  then  defined  by 

f  <?(*(*));  N(t)>i 

Q  (0  =  < 

(  0;  N(t)  =  0, 

_  JV(t)  _ 

where  X(t)  =  N(t )  1  53  adopt  the  convention  that  X(t)  =  0  if  N(t)  =  0.  This 

1=1 

estimator,  while  enjoying  reasonable  large-sample  behavior,  suffers  from  small-sample  bias 
that  can  substantially  degrade  the  performance  of  the  estimator.  To  precisely  describe  the 
bias  characteristics  of  a(t)  requires  some  control  on  the  growth  of  g. 

DEFINITION.  Let  ||  •  ||  be  the  Euclidian  norm  on  Rd.  We  say  that  g  is  polynomially 
dominated  to  degree  r  (r  >  0)  if  there  exists  constants  .4  and  B  such  that 

|ff(z)|  <  A  +  B\\x-n\\T 

for  all  x  6  C,  where  C  is  the  convex  hull  of  the  support  of  the  distribution  of  X  (see  p.  31 
of  CHUNG  (1974)  for  a  definition  of  the  support  of  an  FU- valued  r.v.). 


To  obtain  some  feel  for  this  condition,  let  us  note  that  if  g  is  bounded,  then  g  is 
poly nomi ally  dominated  to  degree  0.  This  also  occurs  when  the  Xi  s  lie  almost  surely 
in  some  set  on  which  g  is  bounded.  This  situation  is  illustrated  by  Example  3  when 
T'(>s)|  <  M,  in  which  case  |a(t)|  <  M.  Finally,  wenote  that  if  all  the  partial  derivatives 


of  g  of  order  r  are  globally  bounded  (i.e.  sup  j 

collections  {i\ , . . . ,  id)  such  that  ij  +  ...  +  id  = 
degree  r.  For  1  <  t,  j  <  d,  let 


U9(x) 


:  x  €  R  ‘  >  <  oo  for  all 

d'x\...dzif' 

r),  then  g  is  polynomially  dominated  to 


G=_*_ 

>J  dx,dij 


gif*) 


C ij  =  E  [(Xfc(t)  -  n(i))(Xk(j)  ~  rtj))} , 


where  Xk  =  (-X^jt(l),  •  •  •  ,Xk(d)),  y.  —  (m(1),  ■  •  •  ,n{d)).  Set  A  —  1/Er*. 

We  are  now  ready  to  state  our  small-sample  bias  expansion  for  a(t).  It  is  a  general¬ 
ization  (to  unbounded  g)  of  Theorem  4.3  of  GLYNN  and  HEIDELBERGER  (1990). 

THEOREM  1.  Suppose  that  g  is  polynomially  dominated  to  degree  r(r  >  0)  and  0  <  A  < 
oo.  Let  p  =  max(2,r).  If  there  exists  6  >  0  such  that  E\\X\\2r+1+s  <  oo  and  Et2?**  <  oo, 

,hen  ,  ‘  m 

Ea(t)  =  a+  -  +° 

as  t  — *  oo,  where 

\ -l  d  d 

b= 

.=i  j= i 

In  view  of  the  importance  of  steady-state  simulation,  w’e  provide  the  following  corollary. 
COROLLARY  1.  Consider  Example  3.  If  either: 

i)  sup{|F(s,  u>)|  :  s  >  0,  u  €  0}  <  oo,  Er]5+S  <  oo,  and  Et*+s  <  oo 

or  , 

ii)  there  exists  e  >  0  such  that  P{rj  >  e},  E||-Y||5+  <  oo,  and  Er  <  oo 

then 

Ea(t)  =  a  +  b/t  +  oil/t), 

where 

b=-X-1-(Er,y2-E^1J\Y(S)-a)  . 

This  generalizes  a  bias  expression  due  to  MEKETON  and  HEIDELBERGER  (1983). 
Their  discussion  assumes  that  r]  =  t  or,  equivalently,  that  the  amount  of  computer  time 
expended  to  generate  t  units  of  simulated  time  is  precisely  equal  to  t.  By  contrast,  we  are 
not  assuming  here  that  computer  time  and  simulated  time  are  identical. 
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In  addition  to  the  bias,  the  quality  of  an  estimator  is  also  largely  determined  by  its 
asymptotic  variability.  The  next  result  characterizes  the  central  limit  behavior  of  o(t);  its 
proof  appears  in  GLYNN  and  HEIDELBERGER  (1990)  as  part  of  Theorem  4.1. 

THEOREM  2.  Suppose  that  0  <  A  <  oc  and  that  f?||Ar||2  <  oo.  Then, 

t  >  (a(f)  —  a)  =>  crN(0, 1) 


as  t  — »  oo,  where  <j2  =  A  1  ^  ^ 

Assuming  that  a(f)  is  appropriately  uniformly  integrable,  Theorem  2  implies  that  var 
a(t)  ~  cr2 / #  as  t  — ►  oo  (when  cr2  >  0). 

Our  goal  is  now  to  produce  an  estimator  that  has  the  same  asymptotic  variability  as 
q (#),  but  has  improved  small-sample  bias  behavior.  It  is  well  known  that  one  effective 
technique  for  dealing  with  bias  associated  with  nonlinear  functions  of  sample  means  of 
i.i.d.  r.v.’s  is  to  jackknife  (see  MILLER  (1974)  and  GRAY  and  SCHUCANY  (1972)  for 
further  details).  Unfortunately,  the  observations  Xi,X?, . . .  ,Xpj(t)  generated  within  the 
budget  constraint  t  are  not  independent  (and  are  not  identically  distributed  as  a  function 
of  t ).  In  particular,  the  sum  of  the  N(t )  r,’s  is  constrained  to  be  less  than  or  equal  to  f, 
and  hence  the  Tj’s  (and  consequently  the  AYs)  are  dependent.  Nevertheless,  we  will  show 
that  the  dependency  induced  by  the  budget  constraint  is  mild  enough  that  the  jackknife 
estimator  continues  to  effectively  reduce  bias. 


Let 


Xi(t) 


«.-(*) 


aj(t) 


1 

N(t) 

X ;  Xs  ;  Ad)  >  2 

J  T* 

NW-  1 

.0 

\  N(t)  <  1, 

j  s  (x.d)) 

;  N(t)  >  2 

. 

;  N(t)  <  l 

N(t)a(t)  -  ( N(t )  -  l)a,(f) 


r 


N(t) 

^2 


i=i 


;  N(t)  >  1 


;  N(t)  =  0. 


Our  next  theorem  asserts  that  the  jackknife  estimator  a  j(t)  does  indeed  effectively 
reduce  the  small-sample  bias. 


THEOREM  3.  Suppose  that  g  is  polynomials  dominated  to  degree  r(r  >  0)  and  0  < 
A  <  oo.  If  there  exists  6  >  0  such  that  i?exp(6||Xi||)  <  oo  and  E exp(6ri)  <  oo,  then 


Ea(t)  =  q+  o(l/t) 
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as  t  — *  cc. 

In  particular,  Theorem  3  shows  that  the  jackknifed  estimator  aj(t)  is  an  effective 
technique  for  reducing  bias  in  regenerative  steady-state  simulations  (see  Example  3).  This 
estimator  provides  a  bias  reduction  that  is  qualitatively  identical  to  that  obtained  when 
using  the  low-bias  steady-state  estimator  proposed  by  MEKETON  and  HEIDELBERGER 
(1983).  We  note,  however,  that  in  contrast  to  the  estimator  proposed  there,  the  budget 
constraint  used  here  is  specified  in  terms  of  computer  time,  whereas  their  estimator’s 
budget  constraint  is  determined  by  simulated  time.  Of  course,  a  simulation  time  constraint 
is  just  a  special  case  of  a  computer  time  constraint  (just  set  77  =  r  in  Example  3).  Thus, 
the  estimator  aj(t)  is  a  more  generally  applicable  approach  to  obtaining  bias  reduction  in 
steady-state  regenerative  simulations. 

Our  next  goal  is  to  show  that  the  bias  reduction  obtained  by  using  a  j(t)  comes  at 
no  cost,  in  terms  of  additional  asymptotic  variability.  In  order  to  obtain  a  central  limit 
theorem  for  aj(t),  we  shall  need  to  apply  a  “random  time  charge”  argument  (to  pass 
from  the  discrete  time  scale  of  observations,  expressed  in  terms  of  n,  to  the  continuous 
parameter  computer  time  scale,  expressed  in  terms  of  t ).  This  basically  requires  proving  a 
functional  limit  theorem  for  the  jackknife  estimator.  Unfortunately,  we  have  been  unable 
to  find  such  a  limit  theorem  in  the  literature  (although  ordinary  central  limit  theorem  do 
appear;  see,  for  example,  MILLER  (1964)).  Thus,  part  i)  of  our  next  theorem  states  a 
“strong  approximation”  result  for  the  jackknife  estimator  a  j(n)  given  by 

=  j;  £M*»)  -  (n  -  i)s(X,n)], 

71  1=1 

_  n  _ 

where  JVn  =  n-1  Xj  an^  Xin  —  =  1  Xj/{n  —  1).  A  functional  limit  theorem  for 

;=1  i*1. 

aj(n)  follows  easily  from  the  strong  approximation  result  (see  CSORGO  and  REVESZ 
(1981),  and  is  stated  as  part  ii)  of  Theorem  4.  The  existence  of  this  functional  theorem 
also  guarantees  that  sequential  stopping  rules,  based  on  jackknifed  estimators,  are  asymp¬ 
totically  valid  (see  GLYNN  and  WHITT  (1989)).  Finally,  part  iii)  of  Theorem  4  is  the 
desired  central  limit  theorem  for  a  j(t )  and  follows  directly  from  part  ii)  as  a  consequence 
of  the  “random  time  change”  argument  previously  mentioned. 

THEOREM  4.  Suppose  that  E\\X\  ||p  <  00  for  some  p  >  3  and  that  0  <  A  <  00.  Then, 

i)  There  exists  a  probability  space  (Q,  T ,  P )  supporting  a  sequence  {a }(n)'  :  n  >  1}  and 
an  Revalued  Brownian  motion  B  (with  covariance  matrix  C  =  (CtJ  :  1  <  *,  j  <  d)) 
such  that: 

a)  {a*j(n)'  :  n  >  l}={a}(n)  :  n  >  1},  where  =  denotes  equality  in  distribution 

b)  Q}(n)'  =  a  +  ^g(p)B(n)/n  +  o  (n1/p_1)  a.s. 

ii)  Let  \n(t)  =  n1/2(a}(ni)  -  a)  and  *(f)  =  ^g{p)B(t)/t.  Then,  for  e  >  0, 

Xn  =»  X 
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in  the  Skorohod  space  D[e,  oo),  where  D[e,  oo)  is  the  space  of  right-continuous  func¬ 
tions  with  left  limits  (see  ETHIER  and  KLRTZ  (1986)  for  details  on  this  space). 

iii)  j(t)  —  a)  =>  crAr(0, 1)  as  t  — *■  oo,  where  a2  is  defined  as  in  Theorem  2. 

In  the  presence  of  appropriate  uniform  integrability  conditions,  we  find  that  part  iii) 
of  Theorem  4  asserts  that  var  atj(t)  ~  o 2 /t  as  t  — ►  oo  (provided  a2  >  0).  Contrasting 
Theorem  4  with  Theorem  2,  we  therefore  conclude  that  aj(t)  possesses  the  same  asymp¬ 
totic  variability  as  does  a(f).  Since  aj(t)  has  superior  small-sample  bias  characteristics, 
as  compared  to  a(<),  this  suggests  that  aj(t)  will  often  be  a  superior  estimator  to  a(t)  in 
small-sample  settings.  In  the  standard  i.i.d.  ratio  estimation  context,  this  type  of  behavior 
was  observed  empirically  by  IGLEHART  (1975). 
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3.  EMPIRICAL  RESULTS 


In  this  section,  we  discuss  the  results  of  simulation  experiments  that  compare,  em¬ 
pirically,  the  properties  of  various  estimators  including  the  jackknife.  We  will  restrict  our 
attention  to  the  problem  of  steady-state  estimation  in  two  queueing  systems:  the  M/M/1 
queue  and  a  simple  queueing  network  model  of  a  computer  system.  For  the  M/M/1  queue 
we  consider  estimation  of  the  mean  steady-state  waiting  tiem  E[W].  We  let  A  denote  the 
arrival  rate,  p  the  service  rate  and  define  p  =  \J  p  to  be  the  traffic  intensity.  In  our  exper¬ 
iments  we  set  p  =  0.9  which  we  obtain  with  A  =  0.9  and  p  =  1.0.  With  these  parameters, 
E\W]  =  (l/p)p/(l  —  p)  =  9.0.  We  select  the  empty  and  idle  state  to  be  our  regeneration 
state:  the  expected  number  of  customers  served  per  regenerative  cycle  is  1/(1  —  p)  =  10.0. 

The  queueing  network  model  is  shown  in  Figure  1.  There  are  five  queues,  one  repre¬ 
senting  a  CPU  and  four  representing  I/O  disk  drives.  The  service  discipline  at  all  queues 
is  FCFS  and  all  service  times  are  exponentially  distributed.  Jobs  arrive  to  the  system  at 
the  CPU  according  to  a  Poisson  process  with  rat  A.  Upon  leaving  the  CPU,  the  job  goes 
to  I/O  device  i  with  probability  pi  for  i  =  1, ...  ,4.  A  job  is  then  routed  back  to  the  CPU 
with  probability  p,  or  exits  the  system  with  probability  (1  —  p).  We  let  So  denote  the  mean 
service  time  of  a  job  at  the  CPU,  and  let  Si  denote  the  mean  service  time  of  a  job  at  the 
I/O  device  i.  Under  these  assumptions,  the  model  is  a  Jackson  network  having  a  product 
form  solution  (see  JACKSON  (1957)  or,  e.g.,  Chapter  3  of  LAVENBERG  (19S3))  so  that 
steady-state  performance  measures  are  easy  to  compute.  We  will  be  interested  in  estimat¬ 
ing  the  expected  system  response  time  Eji?]  which  is  the  total  expected  time  that  a  job 
spends  in  the  system.  For  our  experiments  we  set  A  =  1,  p,  =  0.25,  p  =  0.75,  S0  =  0.2,  and 

5,  =  0.5  for  i  =  1, _ 4.  With  these  parameter  settings,  £[7?]  =  8.0  and  the  steady-state 

utilization  of  the  CPU  is  0.8,  while  the  steady-state  utilization  of  each  I/O  device  is  0.5. 
The  probability  of  an  empty  system  (our  regeneration  state)  is  0.2  x  (0.5)4  =  0.0125,  which 
since  arrivals  are  Poisson  is  also  the  fraction  of  arrivals  to  an  empty  system.  Therefore, 
the  expected  number  of  arrivals  per  regenerative  cycle  is  80(=  1.0125).  We  used  the  IBM 
Research  Queueing  Package  (RESQ)  to  simulate  this  network  (see  SAUER.  MACNAIR 
and  KUROSE  (1984)). 

Let  t  denote  the  length  of  the  simulation  (in  some  time  unit).  We  will  consider  a 
variety  of  estimators,  some  having  "high”  bias  (i.e.,  bias  of  order  l/t),  and  others  having 
“low”  bias  (i.e.,  bias  of  order  o(l/t)).  Let  T„  =  T\  4-  . . .  +  r„  be  the  time  of  the  n-th 
regeneration  and  let  k(t)  denote  the  number  of  waiting  (response)  times  completed  by 
time  t  in  the  M/M/1  queue  (queueing  network,  respectively).  Let  d*(()  denote  the  sample 
average  of  all  waiting  (response)  times  completed  by  time  t.  For  example,  in  the  M/M/1 
queue 

*(«) 

4*<'> =  %T' 

Since  we  want  to  emphasize  the  dpeendence  of  the  point  estimates  on  the  number  of  cycles 
completed,  we  will  let  an  be  the  point  estimate  based  on  n  regenerative  cycles.  With  this 
notation,  cxw(t)  =  a(t),  and  ajv(«)+i  =  a(Tjv(t)+i )  is  the  point  estimate  based  on  N(t)  -I- 1 
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cycles  as  suggested  in  MEKETON  and  HEIDELBERGER  (1982).  This  estimator  has 
bias  of  order  o{l/t)  provided  the  steady-state  performance  measure  can  be  expressed  as 
o  =  E[Xk]/E[r]k\  where  T]k  =  cxrk  for  some  constant  c,  i.e.,  the  length  of  the  Jb-th  cycle  in 
simulated  time  (77* )  is  proportional  to  the  length  of  the  Ar-th  cycle  in  "CPU’*  time  (r*  ).  By 
“CPU”  time  we  mean  generally  the  time  unit  that  determines  the  length  of  the  simulation. 
Otherwise  it  has  bias  of  order  l/t  (see  GLYNN  and  HEIDELBERGER  (1990)  for  a  more 
complete  explanation  of  this  phenomenon).  Letting  N(t)  =  max(l,  N(t)),  then  is  a 

point  estimator  which  is  described  in  GLYNN  and  HEIDELBERGER  (1990):  a~(<)  has 
the  same  leading  term  (of  order  1//)  in  its  bias  expansion  as  does  a#(().  However,  using 
N(t)  observations  results  in  an  unbiased  estimate  of  a  simple  mean  value,  whereas  using 
either  N(t)  or  N(t)  +  1  observations  generally  results  in  a  biased  estimate  of  a  simple 
mean.  GLYNN  (1989)  has  proposed  an  adjustment  to  c*k(t)  which  can  sometimes  be  used 
to  reduce  bias:  we  let  Qjt(«)  denote  this  bias  adjusted  estimator.  When  it  can  be  applied, 
the  bias  of  Ok(t)  is  o(l/t)  as  opposed  to  the  order  l/t  bias  of  Qk(t)- 

In  the  M/M/1  queue  simulations,  we  let  t  be  the  number  of  customers  served.  In  this 
case  k(t)  =  t  and  r)k  =  rk  so  that  is  a  low  bias  estimator.  Also,  d*(t)  is  easy  to 

implement  in  this  case.  Thus  we  have  three  low  bias  estimators  {atk(t),  <*Ar(t)+i,  and  a  j(t)) 
and  three  high  bias  estimation  (a.vfi^a-^  and  djt(<))  in  this  case.  For  the  queueing 
network  model,  we  let  t  be  simulated  time.  In  this  case,  r)k  is  the  number  of  jobs  processed 
by  the  network  in  the  k- th  cycle  and  rk  is  the  length  of  the  A'-th  cycle  in  simulated  time. 
Since  we  are  no  longer  assured  that  77*  =  c  x  rkl  a,v(f)+ 1  has  bias  of  order  l/t.  In  addition, 
it  is  not  clear  that  d; t(<)  can  be  efficiently  implemented.  Thus  in  this  case,  we  have  only 
one  low  bias  estimator  (aj(t))  and  four  high  bias  estimators  (a;v(»).  Q  v(«)’  °N(0+i  a°d 

<**(»))• 

Tables  1  and  2  report  the  results  of  simulation  experiments  for  the  M/M/1  queue 
and  the  queueing  netowrk,  respectively.  For  each  t,  R  i.i.d.  replications  were  performed. 
For  each  estimator,  the  sample  average  over  the  R  replications  was  computed,  along  with 
an  estimate  of  its  standard  deviation.  These  can  be  used  to  form  confidence  intervals 
for  the  expected  value  of  an  estimator.  For  example,  from  Table  1,  an  approximate  90% 
confidence  interval  for  £[ajv(t)]  f°r  *  =  500  is  5.618  ±  1.645  x  0.015.  Since  EflW)  =  9.00, 
we  conclude  that  a.v(t)  is  biased  for  this  value  of  t.  In  Table  1,  the  results  are  as  expected: 
the  low  (theoretical)  bias  estimators  axe  generally  closer  to  the  steady-stage  value  than  are 
the  high  bias  estimators.  The  only  exception  is  that  aj(t)  has  more  bias  than  d*(t)  for  the 
shortest  run  (t  =  500).  The  jackknife  also  has  higher  variance  than  the  other  estimators 
for  short  runs.  For  intermediate  to  long  runs,  the  low  bias  estimators  actually  do  reduce 
bias  without  a  significant  icrease  in  variance. 

Similar  observations  can  be  obtained  from  Table  2  for  the  queueing  network  model. 
Although  only  the  jackknife  has  provably  low  bias  in  this  example,  a^(«)+i  also  performs 
well. 
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4.  PROOFS 

Proof  of  Theorem  1.  Fix  e  >  0  such  that  g  has  bounded  third  partial  derivatives  over 
the  set  {a:  :  ||x  -  p\\  <  e}.  Let  I(t)  =  J(||A(t)  -  ^||  >  e)  and  put  Ic(t)  =  1  -  I(t).  Since 

n 

0  <  A  <  oo,  it  follows  that  N(t)  — >  oo  a.s.  Furthermore,  n_1  ~ 1 *  M  a-s-  by  the  strong 

i=i 

law  of  large  numbers.  Hence,  it  is  evident  that  X(t)  — ♦  p  a.s.  as  t  — *  oo,  so  that  I(t)  =  0 
for  t  sufficiently  large  a.s.  We  will  now  show  that  Ea(t)I(t )  =  o(l/t). 

We  first  note  that  since  g  is  polynomially  dominated  to  degree  r  and  X(t)  6  C,  we 
have  that 

(4.1)  E\a(t)\I(t)  <  AEI(t)  +  BE\\X(t)  -  /i|| rI(t). 

Now,  on  {I(t)  =  1},  || X(t)  -  n\\/e  >  1  so 

(4.2)  Em  <  [<''jiix(()  -  ^r/(o]  • 

Also,  a  similar  argument  shows  that 

(4.3)  E||X(0  -  pf/(t)  <  [i'/2||X(()  -  rf'/m]  . 

By  Theorem  4.2  of  GLYNN  and  HEIDELBERGER  (1990),  {tP'2\\X{t)  -  /x||p  :  t  >  t0}  is 
uniformly  integrable  for  some  finite  t0  (the  argument  given  there  easily  extends  to  non¬ 
integer  values  of  p).  Since  I{t)  — ►  0  a.s.,  it  follows  that 

(4.4)  E  [t’l* ||X(()  -  ell »/(«)]  =  0(1) 

as  t  — ►  oo.  Combining  (4.1)-(4.4),  we  conclude  that  E|a(f)|/(t)  =  o(t~p/2)  a s  t  — ►  oo. 

We  now  turn  to  Ea(t)Ic(t).  However,  for  that  term,  we  can  apply  exactly  the  same 
argument  as  that  used  to  prove  Theorem  4.3  of  GLYNN  and  HEIDELBERGER  (1990). 
It  shows  that  Ea(t)Ic(t )  =  a  +  b/t  +  o{l/t)  as  t  ->  oo.  This  clearly  completes  the  proof  of 
our  result. 

Proof  of  Theorem  3.  For  purposes  of  simplifying  our  notation,  we  confine  our  presen¬ 
tation  to  the  case  where  d  =  1.  The  general  case  can  be  attacked  in  precisely  the  same 
way  as  the  scalar  case. 

We  start  by  showing  that  A’,„  is  uniformly  close  to  An,  1  <  i  <  n  (this  is  a  stan¬ 
dard  technique  in  the  jackknifing  literature;  however,  the  standard  references  only  give 
uniformity  in  probability,  rather  than  with  probability  one).  We  first  note  that  for  n  >  1, 

(4.5)  (n  —  1)(A  n  —  A  in) 

—nXn  -  (n  —  1)A,„  —  A„ 

=  £  Xj-Xn  =  Xi-Xn. 
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Hence, 


(4.6) 


max  \X„  -  X,„|  < - -  max  |X,-.Yn| 

l<i<n  n  —  1  l<»<n 

-  1  max  \Xi\  +  -±-\1Cn\ 


n  —  1  !<•<« 


n  -  1 


If  E\X\P  <  oo,  it  follows  that  n  1  53  |A,|P  — ♦  E\X\P  <  oo  a.s. 

»=i 

|.Y„|p/n  — *  0  a.s.,  from  which  it  is  evident  that  ^ma^c  |A',| p/n 
o(nl/p)  a.s.  Thus,  we  conclude  that  if  E\X\P  <  oc, 


as  n  — ►  oo.  Consequently, 
— »  0  a.s.  i.e.  max  LY,|  = 

1  <:<n 


(4.7) 


max  |-Y„  —  A'i„|  =  o(n1^p  *)  a.s. 

1  <»<n 


We  now  proceed  to  prove  our  theorem.  Fix  e  >  0  so  that  g  is  twice  continuously  differen¬ 
tiable  on  {x  :  |x  —  n\  <  3e}.  Since  N(t)  — ♦  oo  a.s.,  (4.7)  implies  that  Ic(t )  — »  1  a.s.,  where 
7c(f)  =  7(|A(f)  -  n\  <  e,  |A'(t)  -  A;(f)|  <  e,  1  <  i  <  N(t),  N(t)  >  2).  We  now  write 
Eaj(t)  as  Ea j(t)  =  Eaj(t)Ic(t)  +  Eaj(t)I(t)<  where  I(t)  =  1  -  Ic(t),  and  analyze  each 
piece  separately. 

If  |An  —  p|  <  e,  | Xin  —  A’„|  <  e,  we  can  expand  g(X,n)  in  a  Taylor  series  about  Xn , 
yielding 

ng(Xn)  -  (n  -  l)ff(Ain) 

=?(Xn)-(n-l)[S(Xin)- 5(^)1 

=g(Xn)  -  (n  -  l)[?'(An)(A,„  -  Xn)  +  ^f^(X,n  -  An)2], 

where  lies  between  A'„  and  X We  further  note  that  the  above  expansions  are  uniform 
in  i  (i.e.  max{|£,„  —  A'„|  :  1  <  i  <  n}  — *•  0  a.s.  as  n  — ►  oo)  by  (4.7).  Hence,  for  n  sufficiently 
large,  (4.5)  implies  that  we  have 


(4.8) 


aj(n)  =  g( Xn)  +  g'(X„)  •  ±  £(X.  -  An) 


i=i 


= s(x„)  -  £t"(  u*Xi-x.?n- 

'  i=l 


Since  aj(t)  =  a}(N(t)),  we  obtain 


Eaj(t)Ic(t)  =  Ea(t )  -  Eg(X{t))I{t) 

1  N(<)  _ 

-  E  I  .  -  £  p"(f, «„)(*,  -  X{t)fUt)/o 


JV(()(W(()  -  1) 


1=1 
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Theorem  1  implies  that  Ea(t)  =  a+b/t+o(l/t).  We  will  shorter  that  EN(t)\g(X(t))\I(t)  - 
o(l/t).  Since  g(X(t))  =  0  on  {.V(f)  =  0},  it  is  clear  that  |$(A(f))|  <  X(t)\g{X(t))\.  Thus, 
the  estimate  on  EN(t)\g(X{t))\I(t)  will  show  that  Eg{X(t))I{t)  =  o(l/t).  As  for  the  third 
term  on  the  right-hand  side  of  (4.9),  note  that 

-+\~1g"(fJi)E(X1-n)2/2  =  b  a.s. 

as  t  -*  x.  Thus,  we  will  be  able  to  conclude  that  Eaj{t)Ic{t)  =  a  +  o(l/t),  once  we  show 
that  the  third  turn,  when  multiplied  by  t,  is  uniformly  integrable.  Let  M  =  max{|ff  (i)|  : 
\x-  n\<  3e}.  Then,  the  left-hand  side  of  (4.10)  is  dominated  by 

This  process  clearly  converges  a.s.  to  MX~1E(X\  -  nf.  We  claim  that  it  is  appropriately 
uniformly  integrable.  Using  the  conditional  exchangeability  of  A'i, . . . ,  A'.v(t)  (see  Section 
2  of  GLYNN  and  HEIDELBERGER  (1990)),  we  find  that  the  expectation  of  (4.11)  is 

(4.12)  MtE[N(t)-\Xi  -A(i))2;  N(t)>  1]. 

To  show  the  uniform  integrability  of  (4.11),  it  suffices  to  show  that  (4.12)  converges  to 
M\~lE(Xi  -nY-  But 

E[(t/N(t)f(X !  -x (f))4;  X{t)  >  l] 

<E±  [( t/N(t ))4;  N(t)  >  1]  •  [2*E(Xx  -  M)8  +  2»E(X(t)  -  /r)8]  = , 

which  is  bounded  in  f  by  Corollary  3.3  and  Theorem  4.2  of  GLYNN  and  HEIDELBERGER 
(1990).  Since  the  process  appearing  in  (4.12)  has  uniformly  bounded  second  moment,  this 
implies  that  (4.12)  does  indeed  converge  to  MA_1£(Ai  -£f)2-  Hence,  (4.11)  (and  thus 
(4.10))  are  also  uniformly  integrable.  Consequently,  Eaj(t)Ic(t )  =  a  -(-  o(l/f). 

To  complete  the  proof,  we  need  to  show  that  Eaj(t)I(t )  =  o(l/t).  Now,  the  “na¬ 
tional  exchangeability  of  A, , ,  A,v(f)  implies  that  each  of  the  terms  o,-(0(l  <  t  <  N(t)) 
have  identical  distributions  when  conditional  on  N(t).  Because  /(f)  is  symmetric  in 
A"i , . .  • ,  X at(i)  )  we  conclude  that 

Eaj(t)I(t)  =  Ecn(t)I(t). 

Since  g  is  polynomially  dominated  to  degree  r ,  we  find  that 

(4.13)  \&xW(t)  <  N(t)\g(X(t))\I(t)  +  N(t)\g(X1mm 

<  N(t)[2AI(t)  +  B(\X(t)  -  fi |r  +  lA'Uf)  -  /ill- 
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Also,  we  have  that  for  q  >  0, 


N(t) 

7(f)  <  I(\X(t)  -fx\>e)  +  Y  J(W)  “  *i(*)l  >  «.^(0  Z  2) 

i=i 

+  W)<1) 

A’(«) 

<  | X(t)  -  n\'lqq  +  Yj  W)  -  Xi(t)\'/e*I(N(i)  >  2)  +  I(n  +  r2  >  f) 

i=i 

MO 

<  |X(t)  -  n\ v??  +  2«  |Xi  -  X(<)r/iV(<)^^(iV(f)  >  l)  +  J(n  +  7-2  >  t), 

1=1 

where  we  used  (4.5)  and  the  fact  that  (N{t)  -  l)"1  <  2 jV(f)"1  on  {.V(f)  >  2}  to  obtain 
the  last  inequality.  Similarly,  we  find  that 

(4.15)  | Xx(f)  -  p|r  <  2r| X1(t)  -  X(t)\rJ  2r|X(f)  -  p|r 

<  22rN(t)~r\X1  -  X(f)|r  +  2r|A-(t)  -  Hr. 


Now,  the  conditional  exchangeability  of  Xi,. . .  ,X^(t)  guarantees  that 

MO 

(4.16)  E  Y  I  Xi  -  XiWNW'-Um)  >  1) 

i=i 

=£[iV(f)2-»|X1  -  X(f)|9;  N(t)  >  1]. 

Note  that  if  g  is  polynomially  dominated  to  degree  r,  then  it  is  also  polynomially  dominated 
to  degree  r',  for  any  r1  >  r.  Hence,  we  have  the  freedom  to  choose  r  and  q  as  large  as 
we  wish  in  (4.13)-(4.16).  Combining  (4.13)  to  (4.16),  we  therefore  find  that  in  order  to 
conclude  that  £|a1(f)|7(t)  =  o(l/f),  it  suffices  to  show  that 


(4.17) 

E\X(t)-tfN(t)  =  o(l/t) 

(4.18) 

E\Xx  -  X(t)\pN(t)2~p  =  o| 

(4.19) 

P{t\  4-  r2  >  f }  =  o(l/<) 

for  p  sufficiently  large.  To  obtain  (4.17),  we  apply  the  Cauchy-  Schwarz  inequality.  Theorem 
4.2  of  GLYNN  and  HEIDELBERGER  (1990),  and  Theorem  2.3  of  JANSON  (1983)  (to 
conclude  that  E%N2(t)  =  0(f)).  Relation  (4.18)  is  handled  similarly  to  (4.12)  (choose 
p  >  4).  Finally,  to  obtain  (4.19),  note  that 

P{ti  +  r2  >  f}  <  e~tt(E exp(^ri))2. 

Hence,  (4.17)  -  (4.19)  jure  proved,  showing  that  Ea\{t)I{t )  =  o(l/f)  and  completing  the 
proof  of  the  theorem. 
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Proof  of  Theorem  4.  As  in  the  proof  of  Theorem  3,  we  specialize  to  d  =  1  in  order  to 
simplify  the  notation;  the  general  case  follows  precisely  the  same  form  as  the  scalar  proof. 

n 

Our  starting  point  is  (4.8).  Noting  that  n-1  £3  —  X„)2  — ►  g"(fi)E(Xi  —  /x)2 

i=i 

a.s.  as  n  — >  oo,  we  conclude  that 

(4.20)  Qj(n)  =  ff(Jfn)  +  0(n-1)  a.s. 
as  n  -*  oo.  We  now  expand  <7(A„)  in  a  Taylor  series  about  fi: 

(4.21)  g(Xn)  =  a  +  g\n)(Xn  -  /x)  +  9-^-(Xn  -  /x)2, 

where  £„  lies  between  Xn  and  /x.  Now,  by  the  law  of  the  iterated  logarithm,  A'„  -  n  = 
0(  \/log  log  n/n)  a.s.  Combining  (4.20)  and  (4.21),  we  then  get 

(4.22)  atj{n )  =  a  +  g'(fi)(X„)  -  fi)  +  o(n1/p_1 )  a.s. 

as  n  — »  oc.  We  now  appeal  to  Theorem  1  of  EINMAHL  (1989)  to  guarantee  existence 

'D 

of  a  probability  space  (fi,  T,  P)  supporting  r.v.’s  {.Y(,  :  n  >  1}  =  {.Y„  :  n  >  1}  such  that 
A'(  +  . . .  +  X'n  —  n/x  =  B(n)  +  o(n1/p)  a.s.  (In  what  follows,  we  adopt  the  usual  tradition  of 
assuming  (f l.T.P)  as  our  original  probability  space;  this  can  be  done  without  any  essential 
loss  of  generality.)  Hence 

(4.23)  Xn  -  n  =  +  o(n1/p_1)  a.s. 

n 

Combining  (4.22)  and  (4.23)  yields  part  i)  of  the  theorem. 

Obtaining  part  ii)  is  now  easy.  Note  that 

Xn(t)  =  \7g(n)B(nt)/n1/2t  +  o  (n1/p-1/2t  p -1 )  a.s. 

but  5(n-)/n1''2=5(-)  and  the  term  o  (n1/p~1/2ti~1^j  goes  to  zero  uniformly  in  t  >  e, 
proving  the  result.  Part  iii)  is  now  an  immediate  consequence  of  a  standard  “random  time 
change”  argument  (see  BILLINGSLEY  (1968),  p.  44). 
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Table  1 


Means  and  Standard  Deviations  in  Simulations  of  the  M/M/1  Queue 
with  p  =  0.9,  £[W]  =  9.0 


t  —  500  <  =  1,000  <  =  2,500  <  =  5,000  <  =  10,000 

R  =  40, 000  R  =  20,000  i?  =  20,000  R  =  20,000  i?  =  10,000 


<**«) 

7.319 

0.022 

8.049 

0.030 

5.618 

0.015 

7.061 

0.024 

Q.V(«) 

5.646 

0.016 

7.063 

0.024 

0-N(t)+\ 

8.113 

0.025 

8.584 

0.032 

6.796 

0.025 

8.123 

0.037 

8.211 

0.032 

8.611 

0.038 

8.614 

8.796 

8.891 

0.023 

0.017 

0018 

8.238 

8.615 

8.800 

0.022 

0.017 

0.017 

8.238 

8.615 

8.800 

0.022 

0.017 

0.017 

8.893 

8.956 

8.971 

0.024 

0.018 

0.018 

8.848 

8.943 

8.970 

0.029 

0.020 

0.019 

8.883 

8.945 

8.971 

0.026 

0.018 

0.018 
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Table  2 


Means  and  Standard  Deviations  in  Simulations  of  the  Queueing  Network  Model 

of  a  Computer  System  =  8.0 


t  =  500 

R  =  4,000 

t  =  1,000 

R  =  2,000 

t  =  2, 000 

R  =  1,000 

t  =  4,000 
R  =  500 

<**«) 

7.620 

0.025 

7.819 

0.028 

7.943 

0.028 

7.943 

0.029 

C*N(0 

6.926 

0.031 

7.628 

0.030 

7.885 

0.029 

7.917 

0.030 

aN(f) 

7.090 

0.028 

7.639 

0.029 

7.SS5 

0.029 

7.917 

0.030 

7.870 

0.024 

7.969 

0.027 

8.004 

0.028 

7.977 

0.029 

<*./(*) 

7.860 

0.046 

8.054 

0.041 

8.049 

0.033 

7.991 

0.031 
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Figure  I.  Queueing  Network  Model  of  a  Simple  Computer  System 
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